Adventures in Multilingual Parsing

Author

  • Joakim Nivre
Abstract

The typological diversity of the world’s languages poses important challenges for the techniques used in machine translation, syntactic parsing and other areas of natural language processing. Statistical models developed and tuned for English do not necessarily perform well for richly inflected languages, where larger morphological paradigms and more flexible word order give rise to data sparseness. Since paradigms can easily be captured in rule-based descriptions, this suggests that hybrid approaches combining statistical modeling with linguistic descriptions might be more effective. However, in order to gain more insight into the benefits of different techniques from a typological perspective, we also need linguistic resources that are comparable across languages, and such resources are currently lacking to a large extent. In this talk, I will report on two ongoing projects that tackle these issues in different ways. In the first part, I will describe techniques for joint morphological and syntactic parsing that combine statistical dependency parsing and rule-based morphological analysis, specifically targeting the challenges posed by richly inflected languages. In the second part, I will present the Universal Dependency Treebank Project, a recent initiative seeking to create multilingual corpora with morphosyntactic annotation that is consistent across languages.

Related resources

Neural Architectures for Multilingual Semantic Parsing

In this paper, we address semantic parsing in a multilingual context. We train one multilingual model that is capable of parsing natural language sentences from multiple different languages into their corresponding formal semantic representations. We extend an existing sequence-to-tree model to a multi-task learning framework which shares the decoder for generating semantic representations. We ...

Adapting Multilingual Parsing Models to Sinica Treebank

This paper presents our work for participation in the 2012 CIPS-SIGHAN shared task of Traditional Chinese Parsing. We have adopted two multilingual parsing models – a factored model (Stanford Parser) and an unlexicalized model (Berkeley Parser) for parsing the Sinica Treebank. This paper also proposes a new Chinese unknown word model and integrates it into the Berkeley Parser. Our experiment gi...

A Two-Stage Parser for Multilingual Dependency Parsing

We present a two-stage multilingual dependency parsing system submitted to the Multilingual Track of CoNLL-2007. The parser first identifies dependencies using a deterministic parsing method and then labels those dependencies as a sequence labeling problem. We describe the features used in each stage. For four languages with different values of ROOT, we design some special features for the ROOT...

A System for Multilingual Dependency Parsing based on Bidirectional LSTM Feature Representations

In this paper, we present our multilingual dependency parser developed for the CoNLL 2017 UD Shared Task dealing with “Multilingual Parsing from Raw Text to Universal Dependencies”. Our parser extends the monolingual BIST-parser as a multi-source multilingual trainable parser. Thanks to multilingual word embeddings and one hot encodings for languages, our system can use both monolingual and mu...

Multilingual Semantic Parsing: Parsing Multiple Languages into Semantic Representations

We consider multilingual semantic parsing – the task of simultaneously parsing semantically equivalent sentences from multiple different languages into their corresponding formal semantic representations. Our model is built on top of the hybrid tree semantic parsing framework, where natural language sentences and their corresponding semantics are assumed to be generated jointly from an underlyi...

Publication date: 2014